e1071 package SVM

Support Vector Machine Tutorial

Install the package
install.packages("e1071",repos="http://cran.rstudio.com/")
library(e1071)
Check Data Sample

-R ships with a sample data set, iris, for classifying three species of iris (a kind of flower).
-Species: Iris setosa, Iris versicolor, Iris virginica (150 rows, 50 for each).
-Features: Sepal.Length, Sepal.Width, Petal.Length, Petal.Width

#quick check
iris
##     Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
## 1            5.1         3.5          1.4         0.2     setosa
## 2            4.9         3.0          1.4         0.2     setosa
## 3            4.7         3.2          1.3         0.2     setosa
## 4            4.6         3.1          1.5         0.2     setosa
.
.
.
## 148          6.5         3.0          5.2         2.0  virginica
## 149          6.2         3.4          5.4         2.3  virginica
## 150          5.9         3.0          5.1         1.8  virginica
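
Printing all 150 rows is rarely needed. A more compact check, sketched here with base R functions only, shows the column types, the number of rows per species, and the range of each feature:

```r
# Column types and the first few values of each column
str(iris)

# Number of rows per species (50 each)
species_counts <- table(iris$Species)
print(species_counts)

# First three rows only, instead of the full data frame
head(iris, 3)

# Min, quartiles, mean, and max of the four numeric features
summary(iris[, 1:4])
```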
Data Preview

-Use 2D scatter plots to check the relationship between each pair of features. With 4 features there are C(4,2) = 6 such plots.
-Only one of them is shown here, as an example of how to make the plot.

#preview
# as.numeric() on a factor returns its level codes:
# setosa = 1, versicolor = 2, virginica = 3, so no further recoding is needed
i1 <- as.numeric(iris$Species)
plot(iris$Sepal.Length,iris$Petal.Width,xlab="the length of Sepal",ylab="the width of petal",main="IRIS",pch=i1,col=i1)
legend("bottomright",c("setosa","versicolor","virginica"),pch=c(1,2,3),col=c(1,2,3))
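
The other five feature pairs can be plotted the same way. The following is a minimal sketch, not part of the original tutorial: it uses combn() to enumerate the pairs and draws them in a 2 x 3 grid. The names feature_pairs and species_code are chosen here for illustration.

```r
# All unordered pairs of the four feature columns: a 2 x 6 character matrix
feature_pairs <- combn(names(iris)[1:4], 2)

# Species as integer codes (1, 2, 3), used for both point shape and colour
species_code <- as.integer(iris$Species)

# Draw the six scatter plots in a 2 x 3 grid, then restore the old layout
op <- par(mfrow = c(2, 3))
for (k in seq_len(ncol(feature_pairs))) {
  x_name <- feature_pairs[1, k]
  y_name <- feature_pairs[2, k]
  plot(iris[[x_name]], iris[[y_name]],
       xlab = x_name, ylab = y_name,
       pch = species_code, col = species_code)
}
par(op)
```

Base R's pairs(iris[, 1:4], col = species_code) gives a similar overview in a single call, at the cost of smaller panels.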

Separate Data into Training and Test Samples
set.seed(1234)
all <- sample(2,nrow(iris),replace=TRUE,prob=c(0.7,0.3))
train <- iris[all==1,]
test <- iris[all==2,]
train
##     Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
## 1            5.1         3.5          1.4         0.2     setosa
## 2            4.9         3.0          1.4         0.2     setosa
## 3            4.7         3.2          1.3         0.2     setosa
## 4            4.6         3.1          1.5         0.2     setosa
.
.
.
## 145          6.7         3.3          5.7         2.5  virginica
## 146          6.7         3.0          5.2         2.3  virginica
## 148          6.5         3.0          5.2         2.0  virginica
## 150          5.9         3.0          5.1         1.8  virginica
test
##     Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
## 5            5.0         3.6          1.4         0.2     setosa
## 14           4.3         3.0          1.1         0.1     setosa
## 16           5.7         4.4          1.5         0.4     setosa
## 26           5.0         3.0          1.6         0.2     setosa
## 28           5.2         3.5          1.5         0.2     setosa
.
.
.
## 137          6.3         3.4          5.6         2.4  virginica
## 140          6.9         3.1          5.4         2.1  virginica
## 142          6.9         3.1          5.1         2.3  virginica
## 147          6.3         2.5          5.0         1.9  virginica
## 149          6.2         3.4          5.4         2.3  virginica
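
Because sample() assigns each row to group 1 or 2 independently, the split above is only approximately 70/30, and the species are not guaranteed to be balanced within each group. The sketch below checks the group sizes and, as an alternative that is not the tutorial's method, draws exactly 70% of the row indices. The names grp, train_exact, and test_exact are illustrative.

```r
set.seed(1234)

# Approach used above: each row lands in group 1 with probability 0.7,
# so the group sizes fluctuate around 105 and 45
grp <- sample(2, nrow(iris), replace = TRUE, prob = c(0.7, 0.3))
print(table(grp))                      # actual group sizes
print(table(iris$Species[grp == 1]))   # species counts in the training group

# Alternative (an assumption, not the tutorial's method):
# sample exactly 70% of the row indices without replacement
n_train   <- round(0.7 * nrow(iris))
train_idx <- sample(nrow(iris), n_train)
train_exact <- iris[train_idx, ]
test_exact  <- iris[-train_idx, ]
print(c(train = nrow(train_exact), test = nrow(test_exact)))
```

The exact counts from the first approach depend on the R version, because the default sampling algorithm changed in R 3.6.0.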

SVM Classifier

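-Type: C-classification (available types: C-classification, nu-classification, one-classification (for novelty detection), eps-regression, nu-regression)
-Cost: the constant C that penalizes the slack variables, i.e. points that violate the margin
-Kernel: radial (available kernels: linear, polynomial, radial basis, sigmoid)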
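
To make the kernel choice concrete, here is a minimal base R sketch of the linear and radial basis kernels as described in the e1071 documentation (linear: u'v, radial: exp(-gamma * |u - v|^2), with gamma defaulting to 1 / number of features). The helper functions linear_kernel and radial_kernel are written here for illustration and are not part of e1071. Note also that svm() scales the features by default, so the values it computes internally will differ from these raw ones.

```r
# Two feature vectors from iris (row 1 is setosa, row 51 is versicolor)
u <- as.numeric(iris[1, 1:4])
v <- as.numeric(iris[51, 1:4])

# Linear kernel: the plain dot product of the two vectors
linear_kernel <- function(u, v) sum(u * v)

# Radial basis kernel: decays with the squared Euclidean distance
radial_kernel <- function(u, v, gamma) exp(-gamma * sum((u - v)^2))

# Default gamma in e1071 is 1 / (data dimension), i.e. 0.25 for 4 features
gamma <- 1 / length(u)

print(linear_kernel(u, v))
print(radial_kernel(u, v, gamma))
print(radial_kernel(u, u, gamma))   # identical points always give 1
```

The radial kernel is always between 0 and 1 and shrinks as the two points move apart, which is what lets it model curved class boundaries.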

# Store the fitted model under its own name so it does not mask the svm() function
model <- svm(train[,1:4], train[,5], type="C-classification", cost=10, kernel='radial')
pred <- predict(model, test[,1:4], decision.values=TRUE)
table(pred,test[,5])
##             
## pred         setosa versicolor virginica
##   setosa         10          0         0
##   versicolor      0         12         2
##   virginica       0          0        14
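
A confusion table like this one can be reduced to an overall accuracy and a per-class recall. The sketch below re-enters the radial-kernel counts printed above as a matrix, so it runs without refitting the model. The names cm, accuracy, and recall are illustrative.

```r
classes <- c("setosa", "versicolor", "virginica")

# Counts copied from the radial-kernel table: rows = predicted, columns = actual
cm <- matrix(c(10,  0,  0,
                0, 12,  2,
                0,  0, 14),
             nrow = 3, byrow = TRUE,
             dimnames = list(pred = classes, actual = classes))

# Overall accuracy: correct predictions (the diagonal) over all test rows
accuracy <- sum(diag(cm)) / sum(cm)
print(accuracy)   # 36 / 38

# Per-class recall: correct predictions over the actual count of each class
recall <- diag(cm) / colSums(cm)
print(recall)
```

With a live model the same two lines work directly on the object returned by table(pred, test[,5]).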

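-The plot in the Data Preview section suggests that the three species are close to linearly separable, so a linear kernel may give a better result than the radial basis kernel. On this test split it does: no misclassified rows, compared with 2 for the radial kernel.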

# Same settings as before, only the kernel changes
model <- svm(train[,1:4], train[,5], type="C-classification", cost=10, kernel='linear')
pred <- predict(model, test[,1:4], decision.values=TRUE)
table(pred,test[,5])
##             
## pred         setosa versicolor virginica
##   setosa         10          0         0
##   versicolor      0         12         0
##   virginica       0          0        16
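
A difference of two rows on a 38-row test set is small, so the comparison may not hold on another split. One way to check it, sketched here as an assumption rather than as part of the original tutorial, is the cross argument of svm(), which runs k-fold cross-validation and stores the overall accuracy (in percent) in tot.accuracy. The sketch uses the full iris data and 5 folds. The name cv_acc is illustrative.

```r
library(e1071)

set.seed(1234)
kernels <- c("radial", "linear")

# Fit each kernel with 5-fold cross-validation and keep the overall accuracy
cv_acc <- sapply(kernels, function(k) {
  fit <- svm(Species ~ ., data = iris,
             type = "C-classification", cost = 10,
             kernel = k, cross = 5)
  fit$tot.accuracy   # cross-validated accuracy, in percent
})

print(cv_acc)
```

If the two numbers are close, the advantage of the linear kernel seen above is probably within the noise of a single split.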